skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Fan, Pengxiang"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Tremendous plant metabolic diversity arises from phylogenetically restricted specialized metabolic pathways. Specialized metabolites are synthesized in dedicated cells or tissues, with pathway genes sometimes colocalizing in biosynthetic gene clusters (BGCs). However, the mechanisms by which spatial expression patterns arise and the role of BGCs in pathway evolution remain underappreciated. In this study, we investigated the mechanisms driving acylsugar evolution in the Solanaceae. Previously thought to be restricted to glandular trichomes, acylsugars were recently found in cultivated tomato roots. We demonstrated that acylsugars in cultivated tomato roots and trichomes have different sugar cores, identified root-enriched paralogs of trichome acylsugar pathway genes, and characterized a key paralog required for root acylsugar biosynthesis,SlASAT1-LIKE(SlASAT1-L), which is nested within a previously reported trichome acylsugar BGC. Last, we provided evidence thatASAT1-Larose through duplication of its paralog,ASAT1, and was trichome-expressed before acquiring root-specific expression in theSolanumgenus. Our results illuminate the genomic context and molecular mechanisms underpinning metabolic diversity in plants. 
    more » « less
  2. Plants produce phylogenetically and spatially restricted, as well as structurally diverse specialized metabolites via multistep metabolic pathways. Hallmarks of specialized metabolic evolution include enzymatic promiscuity and recruitment of primary metabolic enzymes and examples of genomic clustering of pathway genes. Solanaceae glandular trichomes produce defensive acylsugars, with sidechains that vary in length across the family. We describe a tomato gene cluster on chromosome 7 involved in medium chain acylsugar accumulation due to trichome specific acyl-CoA synthetase and enoyl-CoA hydratase genes. This cluster co-localizes with a tomato steroidal alkaloid gene cluster and is syntenic to a chromosome 12 region containing another acylsugar pathway gene. We reconstructed the evolutionary events leading to this gene cluster and found that its phylogenetic distribution correlates with medium chain acylsugar accumulation across the Solanaceae. This work reveals insights into the dynamics behind gene cluster evolution and cell-type specific metabolite diversity. 
    more » « less
  3. Marshall-Colon, Amy (Ed.)
    Abstract Plant specialized metabolites mediate interactions between plants and the environment and have significant agronomical/pharmaceutical value. Most genes involved in specialized metabolism (SM) are unknown because of the large number of metabolites and the challenge in differentiating SM genes from general metabolism (GM) genes. Plant models like Arabidopsis thaliana have extensive, experimentally derived annotations, whereas many non-model species do not. Here we employed a machine learning strategy, transfer learning, where knowledge from A. thaliana is transferred to predict gene functions in cultivated tomato with fewer experimentally annotated genes. The first tomato SM/GM prediction model using only tomato data performs well (F-measure = 0.74, compared with 0.5 for random and 1.0 for perfect predictions), but from manually curating 88 SM/GM genes, we found many mis-predicted entries were likely mis-annotated. When the SM/GM prediction models built with A. thaliana data were used to filter out genes where the A. thaliana-based model predictions disagreed with tomato annotations, the new tomato model trained with filtered data improved significantly (F-measure = 0.92). Our study demonstrates that SM/GM genes can be better predicted by leveraging cross-species information. Additionally, our findings provide an example for transfer learning in genomics where knowledge can be transferred from an information-rich species to an information-poor one. 
    more » « less
  4. Plant specialized metabolism (SM) enzymes produce lineage-specific metabolites with important ecological, evolutionary, and biotechnological implications. UsingArabidopsis thalianaas a model, we identified distinguishing characteristics of SM and GM (general metabolism, traditionally referred to as primary metabolism) genes through a detailed study of features including duplication pattern, sequence conservation, transcription, protein domain content, and gene network properties. Analysis of multiple sets of benchmark genes revealed that SM genes tend to be tandemly duplicated, coexpressed with their paralogs, narrowly expressed at lower levels, less conserved, and less well connected in gene networks relative to GM genes. Although the values of each of these features significantly differed between SM and GM genes, any single feature was ineffective at predicting SM from GM genes. Using machine learning methods to integrate all features, a prediction model was established with a true positive rate of 87% and a true negative rate of 71%. In addition, 86% of known SM genes not used to create the machine learning model were predicted. We also demonstrated that the model could be further improved when we distinguished between SM, GM, and junction genes responsible for reactions shared by SM and GM pathways, indicating that topological considerations may further improve the SM prediction model. Application of the prediction model led to the identification of 1,220A. thalianagenes with previously unknown functions, each assigned a confidence measure called an SM score, providing a global estimate of SM gene content in a plant genome. 
    more » « less